Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning

Bouadi, Mohamed, Seth, Pratinav, Tanna, Aditya, Sankarapu, Vinay Kumar

arXiv.org Machine Learning

Tabular data drive most real-world machine learning applications, yet building general-purpose models for them remains difficult. Mixed numeric and categorical fields, weak feature structure, and limited labeled data make scaling and generalization challenging. To this end, we introduce Orion-Bix, a tabular foundation model that combines biaxial attention with meta-learned in-context reasoning for few-shot tabular learning. Its encoder alternates standard, grouped, hierarchical, and relational attention, fusing their outputs through multi-CLS summarization to capture both local and global dependencies efficiently. A label-aware ICL head adapts on the fly and scales to large label spaces via hierarchical decision routing. Meta-trained on synthetically generated, structurally diverse tables with causal priors, Orion-Bix learns transferable inductive biases across heterogeneous data. Delivered as a scikit-learn compatible foundation model, it outperforms gradient-boosting baselines and remains competitive with state-of-the-art tabular foundation models on public benchmarks, showing that biaxial attention with episodic meta-training enables robust, few-shot-ready tabular learning. The model is publicly available at https://github.com/Lexsi-Labs/Orion-BiX .



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

First provide a summary of the paper, and then address the following criteria: quality, clarity, originality and significance. Overview: The paper proposes a framework for enforcing structure in Bayesian models via structured prior selection based on the maximum entropy principle. Although the optimal prior may not be tractable, the authors develop an approximation method using submodular optimization. Constructing priors with structured variables is an important topic, so this method should be able to have a good impact. Quality: The paper is technically sound.



Learned ISTA with Error-based Thresholding for Adaptive Sparse Coding

Li, Ziang, Wu, Kailun, Guo, Yiwen, Zhang, Changshui

arXiv.org Artificial Intelligence

Also, it leads to poor generalization to test data with a different distribution (or sparsity) from the training data. To address the above issues, we propose an error-based thresholding (EBT) mechanism for LISTA-based models to improve their adaptivity, which utilizes a function of the layer-wise reconstruction error to suggest a specific threshold for each observation in the shrinkage function of each layer. EBT introduces a function of the evolving estimation error to provide each threshold in the shrinkage functions of the model, and we show that it well disentangles the learnable parameters in the shrinkage functions from the reconstruction errors, endowing the obtained models with improved adaptivity to possible data variations. It has no extra learnable parameter compared with original LISTA-based models, yet shows significantly better performance. With rigorous analyses, we further show that the proposed EBT also leads to a faster convergence.
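A minimal sketch of the error-based idea, as read from the abstract rather than from the authors' code: each layer's threshold is computed from the current reconstruction error instead of being a free learned parameter, so it adapts per observation and vanishes as the residual vanishes. The scaling constant `c`, the step size, and the tiny identity-matrix problem are illustrative assumptions.

```python
# Sketch of an error-based threshold: theta = c * ||A x - b||_2, so the
# shrinkage adapts to each observation instead of using a fixed theta.
# All names (A, b, c, step) are illustrative, not from the paper.

def soft(v, t):
    """Soft-thresholding (shrinkage) operator: sign(v) * max(|v| - t, 0)."""
    return max(abs(v) - t, 0.0) * (1.0 if v >= 0 else -1.0)

def ebt_layer(A, b, x, step, c):
    m, n = len(A), len(A[0])
    # reconstruction error r = A x - b drives the threshold
    r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
    theta = c * sum(ri * ri for ri in r) ** 0.5
    # gradient of the data term: g = A^T r
    g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
    return [soft(x[j] - step * g[j], theta) for j in range(n)]

# Tiny demo: as x approaches the solution of A x = b, theta shrinks to 0,
# so the fixed point is unbiased.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 0.0]
x = [0.0, 0.0]
for _ in range(50):
    x = ebt_layer(A, b, x, step=0.5, c=0.01)
```

Because the threshold is tied to the residual rather than learned, the same layer behaves sensibly on observations with different error magnitudes, which is the adaptivity the abstract refers to.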


Hyperparameter Tuning is All You Need for LISTA

Chen, Xiaohan, Liu, Jialin, Wang, Zhangyang, Yin, Wotao

arXiv.org Machine Learning

Learned Iterative Shrinkage-Thresholding Algorithm (LISTA) introduces the concept of unrolling an iterative algorithm and training it like a neural network. It has had great success on sparse recovery. In this paper, we show that adding momentum to intermediate variables in the LISTA network achieves a better convergence rate and, in particular, the network with instance-optimal parameters is superlinearly convergent. Moreover, our new theoretical results lead to a practical approach of automatically and adaptively calculating the parameters of a LISTA network layer based on its previous layers. Perhaps most surprisingly, such an adaptive-parameter procedure reduces the training of LISTA to tuning only three hyperparameters from data: a new record set in the context of the recent advances on trimming down LISTA complexity. We call this new ultra-lightweight network HyperLISTA. Compared to state-of-the-art LISTA models, HyperLISTA achieves almost the same performance on seen data distributions and performs better when tested on unseen distributions (specifically, those with different sparsity levels and nonzero magnitudes).
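The momentum idea can be sketched, in the spirit of FISTA-style extrapolation, as a shrinkage step applied at an extrapolated point. The fixed momentum weight `beta` here is an assumption for illustration; the paper's instance-optimal parameters are layer-wise and data-dependent, not a single constant.

```python
# Sketch of ISTA with momentum on the intermediate iterate: each step
# shrinks at the extrapolated point y = x + beta * (x - x_prev).
# beta, step, lam and the tiny problem are illustrative only.

def soft(v, t):
    """Soft-thresholding operator."""
    return max(abs(v) - t, 0.0) * (1.0 if v >= 0 else -1.0)

def momentum_ista(A, b, lam, step, beta, iters):
    m, n = len(A), len(A[0])
    x = [0.0] * n
    x_prev = x[:]
    for _ in range(iters):
        # extrapolated (momentum) point
        y = [x[j] + beta * (x[j] - x_prev[j]) for j in range(n)]
        # gradient of 0.5*||A y - b||^2: g = A^T (A y - b)
        r = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x_prev = x
        x = [soft(y[j] - step * g[j], lam * step) for j in range(n)]
    return x

# Tiny demo: for identity A the lasso solution is soft(b, lam) = (1.9, 0).
A = [[1.0, 0.0], [0.0, 1.0]]
b = [2.0, 0.0]
x_hat = momentum_ista(A, b, lam=0.1, step=0.9, beta=0.3, iters=100)
```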


Theoretical Linear Convergence of Unfolded ISTA and Its Practical Weights and Thresholds

Chen, Xiaohan, Liu, Jialin, Wang, Zhangyang, Yin, Wotao

Neural Information Processing Systems

In recent years, unfolding iterative algorithms as neural networks has become an empirical success in solving sparse recovery problems. However, its theoretical understanding is still immature, which prevents us from fully utilizing the power of neural networks. In this work, we study unfolded ISTA (Iterative Shrinkage Thresholding Algorithm) for sparse signal recovery. We introduce a weight structure that is necessary for asymptotic convergence to the true sparse signal. With this structure, unfolded ISTA can attain a linear convergence, which is better than the sublinear convergence of ISTA/FISTA in general cases. Furthermore, we propose to incorporate thresholding in the network to perform support selection, which is easy to implement and able to boost the convergence rate both theoretically and empirically. Extensive simulations, including sparse vector recovery and a compressive sensing experiment on real image data, corroborate our theoretical results and demonstrate their practical usefulness. We have made our codes publicly available.
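The support-selection step mentioned above can be sketched as a shrinkage operator that exempts the largest-magnitude entries, which are trusted to lie on the true support, from thresholding. The top-`p` selection rule and all names here are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of soft-thresholding with support selection: the p entries with
# largest magnitude pass through unshrunk; the rest are soft-thresholded.
# The selection rule and parameter names are illustrative.

def soft(v, t):
    """Soft-thresholding operator."""
    return max(abs(v) - t, 0.0) * (1.0 if v >= 0 else -1.0)

def ss_soft_threshold(x, theta, p):
    """Threshold x, but leave the p entries with largest |x| untouched."""
    keep = set(sorted(range(len(x)), key=lambda j: -abs(x[j]))[:p])
    return [x[j] if j in keep else soft(x[j], theta) for j in range(len(x))]

z = [0.9, -0.05, 0.4, 0.02]
out = ss_soft_threshold(z, theta=0.1, p=1)
```

With `p = 1`, the largest entry (0.9) is kept as-is while the remaining entries are shrunk, so a confidently detected support coordinate avoids the bias that soft-thresholding would otherwise introduce.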
